Overview

Dataset statistics

Number of variables: 29
Number of observations: 2169687
Missing cells: 18568423
Missing cells (%): 29.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.1 GiB
Average record size in memory: 1.0 KiB
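
The missing-cell percentage follows from the other figures in this block. A minimal sketch of the arithmetic, assuming the percentage is taken over all cells (variables times observations); the variable names are illustrative only:

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 29
n_observations = 2_169_687
missing_cells = 18_568_423

# Assumption: the denominator is every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```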

Variable types

DateTime: 2
Categorical: 5
Unsupported: 1
Numeric: 9
Text: 12

Alerts

High correlation: CONTRIBUTING FACTOR VEHICLE 4 is highly overall correlated with CONTRIBUTING FACTOR VEHICLE 5
High correlation: CONTRIBUTING FACTOR VEHICLE 5 is highly overall correlated with CONTRIBUTING FACTOR VEHICLE 4
High correlation: NUMBER OF CYCLIST KILLED is highly overall correlated with NUMBER OF PEDESTRIANS KILLED and 1 other field
High correlation: NUMBER OF MOTORIST INJURED is highly overall correlated with NUMBER OF PERSONS INJURED
High correlation: NUMBER OF MOTORIST KILLED is highly overall correlated with NUMBER OF PERSONS KILLED
High correlation: NUMBER OF PEDESTRIANS KILLED is highly overall correlated with NUMBER OF CYCLIST KILLED and 1 other field
High correlation: NUMBER OF PERSONS INJURED is highly overall correlated with NUMBER OF MOTORIST INJURED
High correlation: NUMBER OF PERSONS KILLED is highly overall correlated with NUMBER OF CYCLIST KILLED and 2 other fields
Imbalance: NUMBER OF CYCLIST INJURED is highly imbalanced (92.0%)
Imbalance: NUMBER OF CYCLIST KILLED is highly imbalanced (99.9%)
Imbalance: CONTRIBUTING FACTOR VEHICLE 4 is highly imbalanced (90.9%)
Imbalance: CONTRIBUTING FACTOR VEHICLE 5 is highly imbalanced (90.1%)
Missing: BOROUGH has 670408 (30.9%) missing values
Missing: ZIP CODE has 670677 (30.9%) missing values
Missing: LATITUDE has 239855 (11.1%) missing values
Missing: LONGITUDE has 239855 (11.1%) missing values
Missing: LOCATION has 239855 (11.1%) missing values
Missing: ON STREET NAME has 467859 (21.6%) missing values
Missing: CROSS STREET NAME has 827830 (38.2%) missing values
Missing: OFF STREET NAME has 1794202 (82.7%) missing values
Missing: CONTRIBUTING FACTOR VEHICLE 2 has 344595 (15.9%) missing values
Missing: CONTRIBUTING FACTOR VEHICLE 3 has 2013111 (92.8%) missing values
Missing: CONTRIBUTING FACTOR VEHICLE 4 has 2133997 (98.4%) missing values
Missing: CONTRIBUTING FACTOR VEHICLE 5 has 2159928 (99.6%) missing values
Missing: VEHICLE TYPE CODE 2 has 428782 (19.8%) missing values
Missing: VEHICLE TYPE CODE 3 has 2019068 (93.1%) missing values
Missing: VEHICLE TYPE CODE 4 has 2135279 (98.4%) missing values
Missing: VEHICLE TYPE CODE 5 has 2160231 (99.6%) missing values
Skewed: NUMBER OF PERSONS KILLED is highly skewed (γ1 = 32.98672025)
Skewed: NUMBER OF PEDESTRIANS KILLED is highly skewed (γ1 = 41.28234763)
Skewed: NUMBER OF MOTORIST KILLED is highly skewed (γ1 = 53.53005542)
Unique: COLLISION_ID has unique values
Unsupported: ZIP CODE is an unsupported type, check if it needs cleaning or further analysis
Zeros: NUMBER OF PERSONS INJURED has 1654274 (76.2%) zeros
Zeros: NUMBER OF PERSONS KILLED has 2166423 (99.8%) zeros
Zeros: NUMBER OF PEDESTRIANS INJURED has 2047534 (94.4%) zeros
Zeros: NUMBER OF PEDESTRIANS KILLED has 2168039 (99.9%) zeros
Zeros: NUMBER OF MOTORIST INJURED has 1842231 (84.9%) zeros
Zeros: NUMBER OF MOTORIST KILLED has 2168418 (99.9%) zeros

Reproduction

Analysis started: 2025-04-20 13:29:42.068171
Analysis finished: 2025-04-20 13:31:31.029049
Duration: 1 minute and 48.96 seconds
Software version: ydata-profiling v4.16.1
Download configuration: config.json
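
The reported duration can be checked against the two timestamps. A small sketch using only the standard library; the format string is an assumption about how the timestamps are laid out:

```python
from datetime import datetime

# Assumed timestamp layout: date, time, microseconds.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2025-04-20 13:29:42.068171", FMT)
finished = datetime.strptime("2025-04-20 13:31:31.029049", FMT)

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```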

Variables

Distinct: 4672
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 16.6 MiB
Minimum: 2012-07-01 00:00:00
Maximum: 2025-04-15 00:00:00
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

Distinct: 1440
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.6 MiB
Minimum: 2025-04-20 00:00:00
Maximum: 2025-04-20 23:59:00
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

BOROUGH
Categorical

Missing 

Distinct: 5
Distinct (%): < 0.1%
Missing: 670408
Missing (%): 30.9%
Memory size: 133.1 MiB

Value          Count
BROOKLYN       479090
QUEENS         402093
MANHATTAN      333177
BRONX          222053
STATEN ISLAND  62866

Length

Max length: 13
Median length: 9
Mean length: 7.4511775
Min length: 5
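
The length figures can be reproduced from the five category counts alone. A minimal sketch, assuming lengths are plain character counts of the non-missing values:

```python
# Non-missing BOROUGH counts, copied from the category counts in this section.
counts = {
    "BROOKLYN": 479090,
    "QUEENS": 402093,
    "MANHATTAN": 333177,
    "BRONX": 222053,
    "STATEN ISLAND": 62866,
}

n_values = sum(counts.values())
# Each value contributes len(name) characters, once per occurrence.
total_chars = sum(len(name) * n for name, n in counts.items())
mean_length = total_chars / n_values
lengths = [len(name) for name in counts]

print(total_chars, round(mean_length, 7), min(lengths), max(lengths))
```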

Characters and Unicode

Total characters: 11171394
Distinct characters: 19
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: BROOKLYN
2nd row: BROOKLYN
3rd row: BROOKLYN
4th row: BRONX
5th row: BROOKLYN

Common Values

Value          Count   Frequency (%)
BROOKLYN       479090  22.1%
QUEENS         402093  18.5%
MANHATTAN      333177  15.4%
BRONX          222053  10.2%
STATEN ISLAND  62866   2.9%
(Missing)      670408  30.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value      Count   Frequency (%)
brooklyn   479090  30.7%
queens     402093  25.7%
manhattan  333177  21.3%
bronx      222053  14.2%
staten     62866   4.0%
island     62866   4.0%
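
The word-level percentages differ from the Common Values percentages because the denominator changes from rows to words. A sketch that reproduces the figures, assuming the tokenizer lowercases each value and splits on whitespace (an inference from the table, not something the report states):

```python
from collections import Counter

# Non-missing BOROUGH counts from the Common Values table.
counts = {
    "BROOKLYN": 479090,
    "QUEENS": 402093,
    "MANHATTAN": 333177,
    "BRONX": 222053,
    "STATEN ISLAND": 62866,
}

# Assumed tokenization: lowercase, then split on whitespace.
words = Counter()
for value, n in counts.items():
    for word in value.lower().split():
        words[word] += n

total_words = sum(words.values())
shares = {w: round(100 * n / total_words, 1) for w, n in words.items()}
print(shares)
```

Under this assumption "STATEN ISLAND" contributes two words per row, which is why "staten" and "island" appear separately with identical counts.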

Most occurring characters

Value             Count    Frequency (%)
N                 1895322  17.0%
O                 1180233  10.6%
A                 1125263  10.1%
E                 867052   7.8%
T                 792086   7.1%
R                 701143   6.3%
B                 701143   6.3%
L                 541956   4.9%
S                 527825   4.7%
Y                 479090   4.3%
Other values (9)  2360281  21.1%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  11171394  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  11171394  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  11171394  100.0%

ZIP CODE
Unsupported

Missing  Rejected  Unsupported 

Missing: 670677
Missing (%): 30.9%
Memory size: 77.6 MiB
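
The Unsupported alert suggests the column needs cleaning before it can be profiled. One possible cleaning step is sketched here, assuming the raw values mix integers, floats, strings and blanks; the report does not confirm this, and normalize_zip is a hypothetical helper, not part of ydata-profiling:

```python
import re

def normalize_zip(raw):
    """Return a 5-digit ZIP string, or None when the value is unusable."""
    if raw is None:
        return None
    text = str(raw).strip()
    # Floats such as 11208.0 can appear when a column mixes numbers and blanks.
    if re.fullmatch(r"\d+\.0+", text):
        text = text.split(".")[0]
    # Accept 3 to 5 digits and left-pad, so 7030 becomes 07030.
    if re.fullmatch(r"\d{3,5}", text):
        return text.zfill(5)
    return None

print(normalize_zip(11208), normalize_zip(" 10001 "), normalize_zip("N/A"))
```

After a pass like this the column holds a single type (string or missing), which is usually what a profiler needs to classify it.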

LATITUDE
Real number (ℝ)

Missing 

Distinct: 128580
Distinct (%): 6.7%
Missing: 239855
Missing (%): 11.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.611136
Minimum: 0
Maximum: 43.344444
Zeros: 5346
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 40.596296
Q1: 40.667454
Median: 40.720434
Q3: 40.769615
95-th percentile: 40.861942
Maximum: 43.344444
Range: 43.344444
Interquartile range (IQR): 0.102161

Descriptive statistics

Standard deviation: 2.1419157
Coefficient of variation (CV): 0.052742078
Kurtosis: 354.99951
Mean: 40.611136
Median Absolute Deviation (MAD): 0.0513358
Skewness: -18.881275
Sum: 78372670
Variance: 4.5878029
Monotonicity: Not monotonic
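
Several of these figures are derived from others in the same block. A short sketch of the relationships, using the standard definitions (IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared); small differences are expected because the inputs are already rounded:

```python
# Values copied from the LATITUDE statistics.
q1, q3 = 40.667454, 40.769615
mean, std = 40.611136, 2.1419157
minimum, maximum = 0.0, 43.344444

iqr = q3 - q1
value_range = maximum - minimum
cv = std / mean
variance = std ** 2

print(f"IQR={iqr:.6f} range={value_range:.6f} CV={cv:.9f} variance={variance:.7f}")
```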
Histogram with fixed size bins (bins=50)

Common Values

Value                  Count    Frequency (%)
0                      5346     0.2%
40.861862              918      < 0.1%
40.696033              803      < 0.1%
40.608757              739      < 0.1%
40.8047                702      < 0.1%
40.798256              647      < 0.1%
40.759308              635      < 0.1%
40.675735              593      < 0.1%
40.6960346             587      < 0.1%
40.658577              556      < 0.1%
Other values (128570)  1918306  88.4%
(Missing)              239855   11.1%

Minimum 10 values

Value       Count  Frequency (%)
0           5346   0.2%
30.78418    1      < 0.1%
34.783634   1      < 0.1%
40.498947   1      < 0.1%
40.4989488  2      < 0.1%
40.4991346  1      < 0.1%
40.49931    1      < 0.1%
40.4994787  1      < 0.1%
40.499659   1      < 0.1%
40.499672   1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
43.344444  1      < 0.1%
42.64154   1      < 0.1%
42.318317  1      < 0.1%
42.107204  1      < 0.1%
41.91661   1      < 0.1%
41.34796   1      < 0.1%
41.258785  1      < 0.1%
41.12615   5      < 0.1%
41.12421   1      < 0.1%
41.061634  2      < 0.1%

LONGITUDE
Real number (ℝ)

Missing 

Distinct: 99833
Distinct (%): 5.2%
Missing: 239855
Missing (%): 11.1%
Infinite: 0
Infinite (%): 0.0%
Mean: -73.721994
Minimum: -201.35999
Maximum: 0
Zeros: 5346
Zeros (%): 0.2%
Negative: 1924486
Negative (%): 88.7%
Memory size: 16.6 MiB

Quantile statistics

Minimum: -201.35999
5-th percentile: -74.038086
Q1: -73.974648
Median: -73.92693
Q3: -73.86668
95-th percentile: -73.76282
Maximum: 0
Range: 201.35999
Interquartile range (IQR): 0.107968

Descriptive statistics

Standard deviation: 4.0013414
Coefficient of variation (CV): -0.054276088
Kurtosis: 372.95801
Mean: -73.721994
Median Absolute Deviation (MAD): 0.05265
Skewness: 15.556815
Sum: -1.4227106 × 10^8
Variance: 16.010733
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value                 Count    Frequency (%)
0                     5346     0.2%
-73.89063             790      < 0.1%
-74.038086            740      < 0.1%
-73.91282             720      < 0.1%
-73.98453             718      < 0.1%
-73.89686             685      < 0.1%
-73.91243             658      < 0.1%
-73.94476             627      < 0.1%
-73.9112              598      < 0.1%
-73.9845292           587      < 0.1%
Other values (99823)  1918363  88.4%
(Missing)             239855   11.1%

Minimum 10 values

Value       Count  Frequency (%)
-201.35999  1      < 0.1%
-201.23706  105    < 0.1%
-89.13527   1      < 0.1%
-86.76847   1      < 0.1%
-79.61955   1      < 0.1%
-79.00183   1      < 0.1%
-76.2634    1      < 0.1%
-76.02163   1      < 0.1%
-74.742     7      < 0.1%
-74.25496   1      < 0.1%

Maximum 10 values

Value        Count  Frequency (%)
0            5346   0.2%
-32.768513   16     < 0.1%
-47.209625   3      < 0.1%
-73.66301    1      < 0.1%
-73.70055    2      < 0.1%
-73.700584   11     < 0.1%
-73.7005968  10     < 0.1%
-73.7006     2      < 0.1%
-73.70061    5      < 0.1%
-73.70071    4      < 0.1%
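
The extreme values for LATITUDE and LONGITUDE (zeros, a longitude below -180, latitudes above 42) suggest that some coordinates are placeholders or entry errors. A minimal sketch of a plausibility check follows. The bounding box is a hypothetical choice loosely enclosing the 5th to 95th percentile ranges reported for both variables, is_plausible is an illustrative helper, and two of the sample pairs are made up from values that appear in the tables:

```python
# Hypothetical bounding box; adjust to the area of interest.
LAT_MIN, LAT_MAX = 40.4, 41.0
LON_MIN, LON_MAX = -74.3, -73.6

def is_plausible(lat, lon):
    """True when both coordinates are present and inside the bounding box."""
    if lat is None or lon is None:
        return False
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX

samples = [
    (40.62179, -73.970024),   # an ordinary coordinate pair
    (0.0, 0.0),               # zero placeholder, seen 5346 times per column
    (40.861862, -201.23706),  # made-up pairing with an out-of-range longitude
    (43.344444, -73.92693),   # made-up pairing with the maximum latitude
]
flags = [is_plausible(lat, lon) for lat, lon in samples]
print(flags)
```

Rows that fail such a check could be set to missing rather than dropped, so that the non-spatial fields remain usable.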

LOCATION
Text

Missing 

Distinct: 314263
Distinct (%): 16.3%
Missing: 239855
Missing (%): 11.1%
Memory size: 154.1 MiB

Length

Max length: 25
Median length: 24
Mean length: 22.72621
Min length: 10

Characters and Unicode

Total characters: 43857767
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 178274
Unique (%): 9.2%

Sample

1st row: (40.62179, -73.970024)
2nd row: (40.667202, -73.8665)
3rd row: (40.683304, -73.917274)
4th row: (40.709183, -73.956825)
5th row: (40.86816, -73.83148)
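
LOCATION is profiled as text, but the sample rows show a "(latitude, longitude)" pair. A small sketch of turning such a string into numbers, assuming every well-formed value follows the pattern seen in the samples; parse_location is a hypothetical helper:

```python
def parse_location(text):
    """Parse "(lat, lon)" into a (float, float) tuple; None if malformed."""
    if not text:
        return None
    parts = text.strip().strip("()").split(",")
    if len(parts) != 2:
        return None
    try:
        return float(parts[0]), float(parts[1])
    except ValueError:
        return None

print(parse_location("(40.62179, -73.970024)"))
```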

Value                  Count    Frequency (%)
0.0                    10692    0.3%
40.861862              918      < 0.1%
40.696033              803      < 0.1%
73.89063               790      < 0.1%
74.038086              740      < 0.1%
40.608757              739      < 0.1%
73.91282               720      < 0.1%
73.98453               718      < 0.1%
40.8047                702      < 0.1%
73.89686               685      < 0.1%
Other values (228402)  3842157  99.5%

Most occurring characters

Value             Count     Frequency (%)
7                 4800580   10.9%
4                 4162714   9.5%
.                 3859664   8.8%
3                 3654090   8.3%
0                 3556631   8.1%
9                 2813709   6.4%
8                 2765641   6.3%
6                 2734950   6.2%
5                 2187840   5.0%
)                 1929832   4.4%
Other values (6)  11392116  26.0%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  43857767  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  43857767  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  43857767  100.0%

ON STREET NAME
Text

Missing 

Distinct: 21730
Distinct (%): 1.3%
Missing: 467859
Missing (%): 21.6%
Memory size: 153.8 MiB

Length

Max length: 32
Median length: 32
Mean length: 28.981801
Min length: 2

Characters and Unicode

Total characters: 49322041
Distinct characters: 75
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7668
Unique (%): 0.5%

Sample

1st row: WHITESTONE EXPRESSWAY
2nd row: QUEENSBORO BRIDGE UPPER
3rd row: OCEAN PARKWAY
4th row: THROGS NECK BRIDGE
5th row: BROOKLYN BRIDGE

Value                Count    Frequency (%)
avenue               622661   15.8%
street               532561   13.6%
east                 156789   4.0%
boulevard            129814   3.3%
west                 117287   3.0%
parkway              78412    2.0%
road                 69656    1.8%
expressway           67146    1.7%
island               32291    0.8%
queens               28427    0.7%
Other values (5459)  2093501  53.3%

Most occurring characters

Value              Count     Frequency (%)
(space)            27652057  56.1%
E                  3792469   7.7%
A                  2027021   4.1%
T                  1890880   3.8%
R                  1731520   3.5%
N                  1478967   3.0%
S                  1463241   3.0%
U                  1005240   2.0%
O                  902527    1.8%
V                  885702    1.8%
Other values (65)  6492417   13.2%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  49322041  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  49322041  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  49322041  100.0%

CROSS STREET NAME
Text

Missing 

Distinct: 23740
Distinct (%): 1.8%
Missing: 827830
Missing (%): 38.2%
Memory size: 126.7 MiB

Length

Max length: 32
Median length: 31
Mean length: 22.28578
Min length: 1

Characters and Unicode

Total characters: 29904330
Distinct characters: 76
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7263
Unique (%): 0.5%

Sample

1st row: 20 AVENUE
2nd row: AVENUE K
3rd row: HENRY HUDSON RIVER
4th row: DECATUR STREET
5th row: EAST 43 STREET

Value                Count    Frequency (%)
avenue               578077   19.5%
street               468679   15.8%
east                 114406   3.9%
west                 72276    2.4%
boulevard            70450    2.4%
road                 56811    1.9%
place                34629    1.2%
parkway              27456    0.9%
3                    19477    0.7%
park                 18027    0.6%
Other values (5590)  1508188  50.8%

Most occurring characters

Value              Count     Frequency (%)
(space)            14177815  47.4%
E                  3023830   10.1%
T                  1497841   5.0%
A                  1471873   4.9%
R                  1183523   4.0%
N                  1109613   3.7%
S                  1023762   3.4%
U                  797667    2.7%
V                  737444    2.5%
O                  600727    2.0%
Other values (66)  4280235   14.3%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  29904330  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  29904330  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  29904330  100.0%

OFF STREET NAME
Text

Missing 

Distinct: 246307
Distinct (%): 65.6%
Missing: 1794202
Missing (%): 82.7%
Memory size: 87.7 MiB

Length

Max length: 40
Median length: 40
Mean length: 34.983336
Min length: 8

Characters and Unicode

Total characters: 13135718
Distinct characters: 84
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 193285
Unique (%): 51.5%

Sample

1st row: 61 Ed Koch queensborough bridge
2nd row: 1211 LORING AVENUE
3rd row: 344 BAYCHESTER AVENUE
4th row: 2047 PITKIN AVENUE
5th row: 480 DEAN STREET

Value                 Count   Frequency (%)
avenue                144519  11.6%
street                132144  10.6%
east                  34848   2.8%
west                  25213   2.0%
boulevard             22997   1.8%
road                  17137   1.4%
ave                   8546    0.7%
lot                   7881    0.6%
st                    7314    0.6%
parking               7267    0.6%
Other values (28055)  839100  67.3%

Most occurring characters

Value              Count    Frequency (%)
(space)            7070993  53.8%
E                  845169   6.4%
T                  464863   3.5%
A                  435893   3.3%
R                  359949   2.7%
N                  316356   2.4%
S                  306999   2.3%
1                  299444   2.3%
U                  213501   1.6%
V                  203631   1.6%
Other values (74)  2618920  19.9%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  13135718  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  13135718  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  13135718  100.0%

NUMBER OF PERSONS INJURED
Real number (ℝ)

High correlation  Zeros 

Distinct: 32
Distinct (%): < 0.1%
Missing: 18
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.32193113
Minimum: 0
Maximum: 43
Zeros: 1654274
Zeros (%): 76.2%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 2
Maximum: 43
Range: 43
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.70984503
Coefficient of variation (CV): 2.2049593
Kurtosis: 47.914741
Mean: 0.32193113
Median Absolute Deviation (MAD): 0
Skewness: 4.1348047
Sum: 698484
Variance: 0.50387997
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=32)

Common Values

Value              Count    Frequency (%)
0                  1654274  76.2%
1                  400107   18.4%
2                  75273    3.5%
3                  24667    1.1%
4                  9105     0.4%
5                  3498     0.2%
6                  1455     0.1%
7                  608      < 0.1%
8                  273      < 0.1%
9                  137      < 0.1%
Other values (22)  272      < 0.1%

Minimum 10 values

Value  Count    Frequency (%)
0      1654274  76.2%
1      400107   18.4%
2      75273    3.5%
3      24667    1.1%
4      9105     0.4%
5      3498     0.2%
6      1455     0.1%
7      608      < 0.1%
8      273      < 0.1%
9      137      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
43     1      < 0.1%
40     1      < 0.1%
34     1      < 0.1%
32     1      < 0.1%
31     1      < 0.1%
27     1      < 0.1%
25     1      < 0.1%
24     3      < 0.1%
23     1      < 0.1%
22     3      < 0.1%

NUMBER OF PERSONS KILLED
Real number (ℝ)

High correlation  Skewed  Zeros 

Distinct: 7
Distinct (%): < 0.1%
Missing: 31
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.0015532416
Minimum: 0
Maximum: 8
Zeros: 2166423
Zeros (%): 99.8%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 8
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.041600263
Coefficient of variation (CV): 26.782866
Kurtosis: 1822.8072
Mean: 0.0015532416
Median Absolute Deviation (MAD): 0
Skewness: 32.98672
Sum: 3370
Variance: 0.0017305819
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Common Values

Value      Count    Frequency (%)
0          2166423  99.8%
1          3129     0.1%
2          84       < 0.1%
3          13       < 0.1%
4          4        < 0.1%
5          2        < 0.1%
8          1        < 0.1%
(Missing)  31       < 0.1%

Minimum 10 values

Value  Count    Frequency (%)
0      2166423  99.8%
1      3129     0.1%
2      84       < 0.1%
3      13       < 0.1%
4      4        < 0.1%
5      2        < 0.1%
8      1        < 0.1%

Maximum 10 values

Value  Count    Frequency (%)
8      1        < 0.1%
5      2        < 0.1%
4      4        < 0.1%
3      13       < 0.1%
2      84       < 0.1%
1      3129     0.1%
0      2166423  99.8%
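
Because NUMBER OF PERSONS KILLED takes only seven distinct values, the skewness in the Skewed alert can be recomputed from the value counts. A sketch using the population (uncorrected) Fisher-Pearson coefficient, m3 / m2^1.5; the report's estimator may apply a sample-size correction, which is assumed to be negligible at this number of rows:

```python
# Value counts for NUMBER OF PERSONS KILLED (non-missing rows only).
value_counts = {0: 2166423, 1: 3129, 2: 84, 3: 13, 4: 4, 5: 2, 8: 1}

n = sum(value_counts.values())
mean = sum(x * c for x, c in value_counts.items()) / n

def central_moment(k):
    """k-th central moment computed from the value counts."""
    return sum(c * (x - mean) ** k for x, c in value_counts.items()) / n

m2 = central_moment(2)
m3 = central_moment(3)
skewness = m3 / m2 ** 1.5

print(f"n={n} mean={mean:.10f} skewness={skewness:.4f}")
```

The long right tail comes from a handful of multi-fatality rows set against more than two million zeros.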

NUMBER OF PEDESTRIANS INJURED
Real number (ℝ)

Zeros 

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.058721373
Minimum: 0
Maximum: 27
Zeros: 2047534
Zeros (%): 94.4%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 27
Range: 27
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.24833533
Coefficient of variation (CV): 4.2290451
Kurtosis: 117.21302
Mean: 0.058721373
Median Absolute Deviation (MAD): 0
Skewness: 5.4971306
Sum: 127407
Variance: 0.061670438
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=14)

Common Values

Value             Count    Frequency (%)
0                 2047534  94.4%
1                 117672   5.4%
2                 3968     0.2%
3                 396      < 0.1%
4                 65       < 0.1%
5                 27       < 0.1%
6                 11       < 0.1%
7                 6        < 0.1%
8                 2        < 0.1%
9                 2        < 0.1%
Other values (4)  4        < 0.1%

Minimum 10 values

Value  Count    Frequency (%)
0      2047534  94.4%
1      117672   5.4%
2      3968     0.2%
3      396      < 0.1%
4      65       < 0.1%
5      27       < 0.1%
6      11       < 0.1%
7      6        < 0.1%
8      2        < 0.1%
9      2        < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
27     1      < 0.1%
19     1      < 0.1%
15     1      < 0.1%
13     1      < 0.1%
9      2      < 0.1%
8      2      < 0.1%
7      6      < 0.1%
6      11     < 0.1%
5      27     < 0.1%
4      65     < 0.1%

NUMBER OF PEDESTRIANS KILLED
Real number (ℝ)

High correlation  Skewed  Zeros 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.00077061807
Minimum: 0
Maximum: 6
Zeros: 2168039
Zeros (%): 99.9%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.02837345
Coefficient of variation (CV): 36.819083
Kurtosis: 2472.9705
Mean: 0.00077061807
Median Absolute Deviation (MAD): 0
Skewness: 41.282348
Sum: 1672
Variance: 0.00080505268
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)

Common Values

Value  Count    Frequency (%)
0      2168039  99.9%
1      1631     0.1%
2      14       < 0.1%
4      1        < 0.1%
6      1        < 0.1%
3      1        < 0.1%

Minimum 10 values

Value  Count    Frequency (%)
0      2168039  99.9%
1      1631     0.1%
2      14       < 0.1%
3      1        < 0.1%
4      1        < 0.1%
6      1        < 0.1%

Maximum 10 values

Value  Count    Frequency (%)
6      1        < 0.1%
4      1        < 0.1%
3      1        < 0.1%
2      14       < 0.1%
1      1631     0.1%
0      2168039  99.9%

NUMBER OF CYCLIST INJURED
Categorical

Imbalance 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 120.0 MiB

Value  Count
0      2109480
1      59495
2      686
3      25
4      1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2169687
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count    Frequency (%)
0      2109480  97.2%
1      59495    2.7%
2      686      < 0.1%
3      25       < 0.1%
4      1        < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
0      2109480  97.2%
1      59495    2.7%
2      686      < 0.1%
3      25       < 0.1%
4      1        < 0.1%

Most occurring characters

Value  Count    Frequency (%)
0      2109480  97.2%
1      59495    2.7%
2      686      < 0.1%
3      25       < 0.1%
4      1        < 0.1%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2169687  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2169687  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2169687  100.0%
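
The 92.0% in the Imbalance alert for NUMBER OF CYCLIST INJURED can be reproduced from the five category counts. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum possible value, log2(number of classes). That definition matches the reported figure, but the report itself does not state the formula:

```python
import math

# Category counts for NUMBER OF CYCLIST INJURED (values 0 through 4).
counts = [2109480, 59495, 686, 25, 1]

total = sum(counts)
probs = [c / total for c in counts]

# Shannon entropy in bits, normalised by the entropy of a uniform distribution.
entropy = -sum(p * math.log2(p) for p in probs if p > 0)
max_entropy = math.log2(len(counts))
imbalance = 1 - entropy / max_entropy

print(f"imbalance: {100 * imbalance:.1f}%")
```

A score of 0% would mean all classes are equally frequent; values near 100% mean a single class dominates, as it does here.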

NUMBER OF CYCLIST KILLED
Categorical

High correlation  Imbalance 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 120.0 MiB

Value  Count
0      2169424
1      262
2      1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2169687
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count    Frequency (%)
0      2169424  > 99.9%
1      262      < 0.1%
2      1        < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
0      2169424  > 99.9%
1      262      < 0.1%
2      1        < 0.1%

Most occurring characters

Value  Count    Frequency (%)
0      2169424  > 99.9%
1      262      < 0.1%
2      1        < 0.1%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2169687  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2169687  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2169687  100.0%

NUMBER OF MOTORIST INJURED
Real number (ℝ)

High correlation  Zeros 

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.23081209
Minimum: 0
Maximum: 43
Zeros: 1842231
Zeros (%): 84.9%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 43
Range: 43
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.67103572
Coefficient of variation (CV): 2.9072815
Kurtosis: 59.571808
Mean: 0.23081209
Median Absolute Deviation (MAD): 0
Skewness: 4.9889082
Sum: 500790
Variance: 0.45028894
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=31)

Common Values

Value              Count    Frequency (%)
0                  1842231  84.9%
1                  220044   10.1%
2                  68504    3.2%
3                  23905    1.1%
4                  8916     0.4%
5                  3444     0.2%
6                  1407     0.1%
7                  581      < 0.1%
8                  263      < 0.1%
9                  131      < 0.1%
Other values (21)  261      < 0.1%

Minimum 10 values

Value  Count    Frequency (%)
0      1842231  84.9%
1      220044   10.1%
2      68504    3.2%
3      23905    1.1%
4      8916     0.4%
5      3444     0.2%
6      1407     0.1%
7      581      < 0.1%
8      263      < 0.1%
9      131      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
43     1      < 0.1%
40     1      < 0.1%
34     1      < 0.1%
31     1      < 0.1%
30     1      < 0.1%
25     1      < 0.1%
24     3      < 0.1%
23     1      < 0.1%
22     2      < 0.1%
21     1      < 0.1%

NUMBER OF MOTORIST KILLED
Real number (ℝ)

High correlation  Skewed  Zeros 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.00063281017
Minimum: 0
Maximum: 5
Zeros: 2168418
Zeros (%): 99.9%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 5
Range: 5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.027494185
Coefficient of variation (CV): 43.44776
Kurtosis: 4006.357
Mean: 0.00063281017
Median Absolute Deviation (MAD): 0
Skewness: 53.530055
Sum: 1373
Variance: 0.00075593019
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)

Common Values

Value  Count    Frequency (%)
0      2168418  99.9%
1      1187     0.1%
2      66       < 0.1%
3      12       < 0.1%
5      2        < 0.1%
4      2        < 0.1%

Minimum 10 values

Value  Count    Frequency (%)
0      2168418  99.9%
1      1187     0.1%
2      66       < 0.1%
3      12       < 0.1%
4      2        < 0.1%
5      2        < 0.1%

Maximum 10 values

Value  Count    Frequency (%)
5      2        < 0.1%
4      2        < 0.1%
3      12       < 0.1%
2      66       < 0.1%
1      1187     0.1%
0      2168418  99.9%

CONTRIBUTING FACTOR VEHICLE 1
Text

Distinct: 61
Distinct (%): < 0.1%
Missing: 7488
Missing (%): 0.3%
Memory size: 158.1 MiB

Length

Max length: 53
Median length: 43
Mean length: 19.580861
Min length: 1

Characters and Unicode

Total characters: 42337719
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Aggressive Driving/Road Rage
2nd row: Pavement Slippery
3rd row: Unspecified
4th row: Following Too Closely
5th row: Passing Too Closely

Value                    Count    Frequency (%)
unspecified              730368   16.9%
driver                   472824   10.9%
inattention/distraction  438290   10.1%
closely                  171246   4.0%
too                      171246   4.0%
to                       155546   3.6%
failure                  136141   3.1%
yield                    129654   3.0%
right-of-way             129654   3.0%
passing                  116503   2.7%
Other values (96)        1674831  38.7%

Most occurring characters

Value              Count     Frequency (%)
i                  4751528   11.2%
e                  4307016   10.2%
n                  3682645   8.7%
t                  2947416   7.0%
o                  2505340   5.9%
r                  2498947   5.9%
s                  2193313   5.2%
(space)            2164104   5.1%
a                  2095204   4.9%
c                  1621715   3.8%
Other values (45)  13570491  32.1%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  42337719  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  42337719  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  42337719  100.0%

CONTRIBUTING FACTOR VEHICLE 2
Text

Distinct: 61
Distinct (%): < 0.1%
Missing: 344595
Missing (%): 15.9%
Memory size: 132.5 MiB

Length

Max length: 53
Median length: 11
Mean length: 13.056416
Min length: 1

Characters and Unicode

Total characters: 23829160
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
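
The mean length reported above can be cross-checked from the other figures, assuming it is the total character count divided by the number of non-missing rows. The row count 2169687 is the number of observations in the dataset overview; the variable names in this sketch are illustrative.

```python
n_rows = 2169687        # observations in the dataset
missing = 344595        # missing values for this variable
total_chars = 23829160  # total characters reported above

non_missing = n_rows - missing
mean_length = total_chars / non_missing

print(non_missing)            # rows that actually hold a value
print(round(mean_length, 6))  # compare with the reported mean length
```

The result agrees with the reported mean length of 13.056416 to within rounding.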

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Unspecified
2nd row: Unspecified
3rd row: Unspecified
4th row: Unspecified
5th row: Unspecified

Value | Count | Frequency (%)
unspecified | 1536370 | 68.6%
driver | 105403 | 4.7%
inattention/distraction | 98376 | 4.4%
other | 34374 | 1.5%
vehicular | 33308 | 1.5%
too | 29279 | 1.3%
closely | 29279 | 1.3%
passing | 22633 | 1.0%
to | 22273 | 1.0%
lane | 21043 | 0.9%
Other values (96) | 308004 | 13.7%

Most occurring characters

Value | Count | Frequency (%)
i | 3754742 | 15.8%
e | 3656072 | 15.3%
n | 2136161 | 9.0%
s | 1829805 | 7.7%
c | 1734021 | 7.3%
d | 1613337 | 6.8%
p | 1609180 | 6.8%
f | 1595364 | 6.7%
U | 1574713 | 6.6%
t | 645410 | 2.7%
Other values (45) | 3680355 | 15.4%

Most occurring categories, scripts, and blocks

Property | Value | Count | Frequency (%)
Category | (unknown) | 23829160 | 100.0%
Script | (unknown) | 23829160 | 100.0%
Block | (unknown) | 23829160 | 100.0%

Every character is assigned to the single category, script, and block (unknown), so the most frequent characters per category, per script, and per block match the table of most occurring characters.

CONTRIBUTING FACTOR VEHICLE 3
Text

Distinct: 53
Distinct (%): < 0.1%
Missing: 2013111
Missing (%): 92.8%
Memory size: 71.7 MiB

Length

Max length: 53
Median length: 11
Mean length: 11.662183
Min length: 1

Characters and Unicode

Total characters: 1826018
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): < 0.1%

Sample

1st row: Unspecified
2nd row: Unspecified
3rd row: Unspecified
4th row: Unspecified
5th row: Unspecified

Value | Count | Frequency (%)
unspecified | 145876 | 85.7%
other | 3048 | 1.8%
vehicular | 3008 | 1.8%
driver | 2275 | 1.3%
closely | 2148 | 1.3%
too | 2148 | 1.3%
following | 2089 | 1.2%
inattention/distraction | 2079 | 1.2%
fatigued/drowsy | 855 | 0.5%
pavement | 434 | 0.3%
Other values (82) | 6286 | 3.7%

Most occurring characters

Value | Count | Frequency (%)
e | 311985 | 17.1%
i | 310458 | 17.0%
n | 160122 | 8.8%
s | 153239 | 8.4%
c | 152712 | 8.4%
d | 148071 | 8.1%
p | 147649 | 8.1%
f | 146846 | 8.0%
U | 146602 | 8.0%
o | 18344 | 1.0%
Other values (45) | 129990 | 7.1%

Most occurring categories, scripts, and blocks

Property | Value | Count | Frequency (%)
Category | (unknown) | 1826018 | 100.0%
Script | (unknown) | 1826018 | 100.0%
Block | (unknown) | 1826018 | 100.0%

Every character is assigned to the single category, script, and block (unknown), so the most frequent characters per category, per script, and per block match the table of most occurring characters.

CONTRIBUTING FACTOR VEHICLE 4
Categorical

High correlation  Imbalance  Missing 

Distinct: 43
Distinct (%): 0.1%
Missing: 2133997
Missing (%): 98.4%
Memory size: 132.6 MiB

Value | Count
Unspecified | 33655
Other Vehicular | 678
Following Too Closely | 413
Driver Inattention/Distraction | 294
Fatigued/Drowsy | 170
Other values (38) | 480

Length

Max length: 43
Median length: 11
Mean length: 11.491734
Min length: 5

Characters and Unicode

Total characters: 410140
Distinct characters: 52
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): < 0.1%

Sample

1st row: Unspecified
2nd row: Unspecified
3rd row: Unspecified
4th row: Unspecified
5th row: Unspecified

Common Values

Value | Count | Frequency (%)
Unspecified | 33655 | 1.6%
Other Vehicular | 678 | < 0.1%
Following Too Closely | 413 | < 0.1%
Driver Inattention/Distraction | 294 | < 0.1%
Fatigued/Drowsy | 170 | < 0.1%
Pavement Slippery | 125 | < 0.1%
Reaction to Uninvolved Vehicle | 44 | < 0.1%
Unsafe Speed | 34 | < 0.1%
Driver Inexperience | 31 | < 0.1%
Outside Car Distraction | 31 | < 0.1%
Other values (33) | 215 | < 0.1%
(Missing) | 2133997 | 98.4%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
unspecified | 33655 | 88.1%
other | 687 | 1.8%
vehicular | 678 | 1.8%
too | 418 | 1.1%
closely | 418 | 1.1%
following | 413 | 1.1%
driver | 325 | 0.9%
inattention/distraction | 294 | 0.8%
fatigued/drowsy | 170 | 0.4%
pavement | 129 | 0.3%
Other values (67) | 1035 | 2.7%

Most occurring characters

Value | Count | Frequency (%)
e | 71167 | 17.4%
i | 70477 | 17.2%
n | 35875 | 8.7%
c | 34920 | 8.5%
s | 34868 | 8.5%
p | 34047 | 8.3%
d | 34019 | 8.3%
f | 33788 | 8.2%
U | 33771 | 8.2%
o | 3254 | 0.8%
Other values (42) | 23954 | 5.8%

Most occurring categories, scripts, and blocks

Property | Value | Count | Frequency (%)
Category | (unknown) | 410140 | 100.0%
Script | (unknown) | 410140 | 100.0%
Block | (unknown) | 410140 | 100.0%

Every character is assigned to the single category, script, and block (unknown), so the most frequent characters per category, per script, and per block match the table of most occurring characters.

CONTRIBUTING FACTOR VEHICLE 5
Categorical

High correlation  Imbalance  Missing 

Distinct: 32
Distinct (%): 0.3%
Missing: 2159928
Missing (%): 99.6%
Memory size: 132.5 MiB

Value | Count
Unspecified | 9199
Other Vehicular | 197
Following Too Closely | 107
Driver Inattention/Distraction | 68
Pavement Slippery | 53
Other values (27) | 135

Length

Max length: 43
Median length: 11
Mean length: 11.467056
Min length: 5

Characters and Unicode

Total characters: 111907
Distinct characters: 51
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12
Unique (%): 0.1%

Sample

1st row: Unspecified
2nd row: Unspecified
3rd row: Unspecified
4th row: Unspecified
5th row: Unspecified

Common Values

Value | Count | Frequency (%)
Unspecified | 9199 | 0.4%
Other Vehicular | 197 | < 0.1%
Following Too Closely | 107 | < 0.1%
Driver Inattention/Distraction | 68 | < 0.1%
Pavement Slippery | 53 | < 0.1%
Fatigued/Drowsy | 41 | < 0.1%
Reaction to Uninvolved Vehicle | 12 | < 0.1%
Obstruction/Debris | 11 | < 0.1%
Alcohol Involvement | 11 | < 0.1%
Driver Inexperience | 10 | < 0.1%
Other values (22) | 50 | < 0.1%
(Missing) | 2159928 | 99.6%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
unspecified | 9199 | 88.2%
other | 199 | 1.9%
vehicular | 197 | 1.9%
too | 109 | 1.0%
closely | 109 | 1.0%
following | 107 | 1.0%
driver | 78 | 0.7%
inattention/distraction | 68 | 0.7%
pavement | 54 | 0.5%
slippery | 53 | 0.5%
Other values (50) | 260 | 2.5%

Most occurring characters

Value | Count | Frequency (%)
e | 19487 | 17.4%
i | 19225 | 17.2%
n | 9758 | 8.7%
c | 9546 | 8.5%
s | 9489 | 8.5%
p | 9333 | 8.3%
d | 9285 | 8.3%
f | 9227 | 8.2%
U | 9222 | 8.2%
o | 837 | 0.7%
Other values (41) | 6498 | 5.8%

Most occurring categories, scripts, and blocks

Property | Value | Count | Frequency (%)
Category | (unknown) | 111907 | 100.0%
Script | (unknown) | 111907 | 100.0%
Block | (unknown) | 111907 | 100.0%

Every character is assigned to the single category, script, and block (unknown), so the most frequent characters per category, per script, and per block match the table of most occurring characters.

COLLISION_ID
Real number (ℝ)

Unique 

Distinct: 2169687
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3229115.9
Minimum: 22
Maximum: 4806433
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 MiB

Quantile statistics

Minimum: 22
5-th percentile: 109347.3
Q1: 3178541.5
Median: 3721118
Q3: 4263766.5
95-th percentile: 4697771.7
Maximum: 4806433
Range: 4806411
Interquartile range (IQR): 1085225

Descriptive statistics

Standard deviation: 1507781.8
Coefficient of variation (CV): 0.46693332
Kurtosis: 0.091475235
Mean: 3229115.9
Median Absolute Deviation (MAD): 542613
Skewness: -1.2508025
Sum: 7.0061708 × 10^12
Variance: 2.273406 × 10^12
Monotonicity: Not monotonic
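
Several of the derived statistics above follow directly from the others. The sketch below recomputes them from the reported (rounded) values, using the standard definitions: range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of rows. Variable names are illustrative, and small differences are expected because the inputs are rounded.

```python
import math

minimum, maximum = 22, 4806433
q1, q3 = 3178541.5, 4263766.5
mean, std = 3229115.9, 1507781.8
n_rows = 2169687  # distinct COLLISION_ID values, with none missing

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance
total = mean * n_rows            # Sum

print(value_range, iqr)
print(f"{cv:.6f} {variance:.6e} {total:.6e}")
```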
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
4806253 | 1 | < 0.1%
4455765 | 1 | < 0.1%
4513547 | 1 | < 0.1%
4675373 | 1 | < 0.1%
4541903 | 1 | < 0.1%
4566131 | 1 | < 0.1%
4623759 | 1 | < 0.1%
4675709 | 1 | < 0.1%
4675769 | 1 | < 0.1%
4623865 | 1 | < 0.1%
Other values (2169677) | 2169677 | > 99.9%

Minimum 10 values

Value | Count | Frequency (%)
22 | 1 | < 0.1%
23 | 1 | < 0.1%
24 | 1 | < 0.1%
25 | 1 | < 0.1%
26 | 1 | < 0.1%
27 | 1 | < 0.1%
28 | 1 | < 0.1%
29 | 1 | < 0.1%
30 | 1 | < 0.1%
31 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
4806433 | 1 | < 0.1%
4806432 | 1 | < 0.1%
4806429 | 1 | < 0.1%
4806428 | 1 | < 0.1%
4806426 | 1 | < 0.1%
4806425 | 1 | < 0.1%
4806423 | 1 | < 0.1%
4806422 | 1 | < 0.1%
4806409 | 1 | < 0.1%
4806408 | 1 | < 0.1%

VEHICLE TYPE CODE 1
Text

Distinct: 1768
Distinct (%): 0.1%
Missing: 15354
Missing (%): 0.7%
Memory size: 152.2 MiB

Length

Max length: 38
Median length: 35
Mean length: 16.845762
Min length: 1

Characters and Unicode

Total characters: 36291380
Distinct characters: 77
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1070
Unique (%): < 0.1%

Sample

1st row: Sedan
2nd row: Sedan
3rd row: Moped
4th row: Sedan
5th row: Station Wagon/Sport Utility Vehicle

Value | Count | Frequency (%)
vehicle | 912569 | 18.0%
utility | 666102 | 13.1%
station | 666056 | 13.1%
sedan | 661932 | 13.0%
wagon/sport | 485764 | 9.6%
passenger | 416225 | 8.2%
(empty) | 181771 | 3.6%
wagon | 180357 | 3.6%
sport | 180291 | 3.6%
truck | 90588 | 1.8%
Other values (1020) | 636005 | 12.5%

Most occurring characters

Value | Count | Frequency (%)
(space) | 2936547 | 8.1%
S | 2843291 | 7.8%
t | 2465415 | 6.8%
i | 2076116 | 5.7%
E | 1820373 | 5.0%
a | 1733286 | 4.8%
e | 1727059 | 4.8%
n | 1657129 | 4.6%
o | 1540424 | 4.2%
T | 1150164 | 3.2%
Other values (67) | 16341576 | 45.0%

Most occurring categories, scripts, and blocks

Property | Value | Count | Frequency (%)
Category | (unknown) | 36291380 | 100.0%
Script | (unknown) | 36291380 | 100.0%
Block | (unknown) | 36291380 | 100.0%

Every character is assigned to the single category, script, and block (unknown), so the most frequent characters per category, per script, and per block match the table of most occurring characters.

VEHICLE TYPE CODE 2
Text

Missing 

Distinct: 1979
Distinct (%): 0.1%
Missing: 428782
Missing (%): 19.8%
Memory size: 134.3 MiB

Length

Max length: 38
Median length: 30
Mean length: 16.034862
Min length: 1

Characters and Unicode

Total characters: 27915172
Distinct characters: 73
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1186
Unique (%): 0.1%

Sample

1st row: Sedan
2nd row: Sedan
3rd row: Pick-up Truck
4th row: Box Truck
5th row: Station Wagon/Sport Utility Vehicle

Value | Count | Frequency (%)
vehicle | 672305 | 17.0%
utility | 485333 | 12.3%
station | 485301 | 12.3%
sedan | 460130 | 11.6%
wagon/sport | 345097 | 8.7%
passenger | 318614 | 8.1%
(empty) | 141646 | 3.6%
wagon | 140262 | 3.5%
sport | 140204 | 3.5%
truck | 90131 | 2.3%
Other values (1078) | 676997 | 17.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 2228083 | 8.0%
S | 2093987 | 7.5%
t | 1762335 | 6.3%
i | 1515114 | 5.4%
E | 1440961 | 5.2%
e | 1263386 | 4.5%
a | 1231686 | 4.4%
n | 1170611 | 4.2%
o | 1124916 | 4.0%
T | 926688 | 3.3%
Other values (63) | 13157405 | 47.1%

Most occurring categories, scripts, and blocks

Property | Value | Count | Frequency (%)
Category | (unknown) | 27915172 | 100.0%
Script | (unknown) | 27915172 | 100.0%
Block | (unknown) | 27915172 | 100.0%

Every character is assigned to the single category, script, and block (unknown), so the most frequent characters per category, per script, and per block match the table of most occurring characters.

VEHICLE TYPE CODE 3
Text

Missing 

Distinct: 286
Distinct (%): 0.2%
Missing: 2019068
Missing (%): 93.1%
Memory size: 72.3 MiB

Length

Max length: 35
Median length: 30
Mean length: 17.66163
Min length: 2

Characters and Unicode

Total characters: 2660177
Distinct characters: 62
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 171
Unique (%): 0.1%

Sample

1st row: Sedan
2nd row: Sedan
3rd row: Station Wagon/Sport Utility Vehicle
4th row: Sedan
5th row: Sedan

Value | Count | Frequency (%)
vehicle | 67310 | 18.5%
utility | 52522 | 14.4%
station | 52519 | 14.4%
sedan | 50913 | 14.0%
wagon/sport | 39160 | 10.7%
passenger | 27716 | 7.6%
(empty) | 13450 | 3.7%
wagon | 13359 | 3.7%
sport | 13358 | 3.7%
truck | 4722 | 1.3%
Other values (231) | 29493 | 8.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 214338 | 8.1%
S | 210484 | 7.9%
t | 197342 | 7.4%
i | 163015 | 6.1%
a | 133085 | 5.0%
e | 132690 | 5.0%
n | 130214 | 4.9%
o | 120783 | 4.5%
E | 116444 | 4.4%
l | 79839 | 3.0%
Other values (52) | 1161943 | 43.7%

Most occurring categories, scripts, and blocks

Property | Value | Count | Frequency (%)
Category | (unknown) | 2660177 | 100.0%
Script | (unknown) | 2660177 | 100.0%
Block | (unknown) | 2660177 | 100.0%

Every character is assigned to the single category, script, and block (unknown), so the most frequent characters per category, per script, and per block match the table of most occurring characters.

VEHICLE TYPE CODE 4
Text

Missing 

Distinct: 111
Distinct (%): 0.3%
Missing: 2135279
Missing (%): 98.4%
Memory size: 67.6 MiB

Length

Max length: 35
Median length: 30
Mean length: 18.02607
Min length: 2

Characters and Unicode

Total characters: 620241
Distinct characters: 58
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 54
Unique (%): 0.2%

Sample

1st row: Station Wagon/Sport Utility Vehicle
2nd row: Sedan
3rd row: Station Wagon/Sport Utility Vehicle
4th row: Sedan
5th row: Sedan

Value | Count | Frequency (%)
vehicle | 15835 | 18.9%
station | 12661 | 15.1%
utility | 12661 | 15.1%
sedan | 12388 | 14.7%
wagon/sport | 9809 | 11.7%
passenger | 5970 | 7.1%
(empty) | 2862 | 3.4%
sport | 2852 | 3.4%
wagon | 2852 | 3.4%
truck | 868 | 1.0%
Other values (110) | 5231 | 6.2%

Most occurring characters

Value | Count | Frequency (%)
(space) | 49637 | 8.0%
S | 49284 | 7.9%
t | 49284 | 7.9%
i | 40437 | 6.5%
a | 32723 | 5.3%
e | 32519 | 5.2%
n | 32170 | 5.2%
o | 29952 | 4.8%
E | 24673 | 4.0%
l | 19876 | 3.2%
Other values (48) | 259686 | 41.9%

Most occurring categories, scripts, and blocks

Property | Value | Count | Frequency (%)
Category | (unknown) | 620241 | 100.0%
Script | (unknown) | 620241 | 100.0%
Block | (unknown) | 620241 | 100.0%

Every character is assigned to the single category, script, and block (unknown), so the most frequent characters per category, per script, and per block match the table of most occurring characters.

VEHICLE TYPE CODE 5
Text

Missing 

Distinct: 75
Distinct (%): 0.8%
Missing: 2160231
Missing (%): 99.6%
Memory size: 66.6 MiB

Length

Max length: 35
Median length: 30
Mean length: 18.181684
Min length: 2

Characters and Unicode

Total characters: 171926
Distinct characters: 56
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 35
Unique (%): 0.4%

Sample

1st row: Station Wagon/Sport Utility Vehicle
2nd row: Station Wagon/Sport Utility Vehicle
3rd row: Sedan
4th row: Sedan
5th row: Station Wagon/Sport Utility Vehicle

Value | Count | Frequency (%)
vehicle | 4294 | 18.5%
utility | 3600 | 15.5%
station | 3600 | 15.5%
sedan | 3516 | 15.1%
wagon/sport | 2798 | 12.0%
passenger | 1487 | 6.4%
(empty) | 804 | 3.5%
wagon | 804 | 3.5%
sport | 802 | 3.5%
truck | 272 | 1.2%
Other values (75) | 1257 | 5.4%

Most occurring characters

Value | Count | Frequency (%)
t | 14069 | 8.2%
(space) | 13788 | 8.0%
S | 13608 | 7.9%
i | 11540 | 6.7%
a | 9308 | 5.4%
e | 9257 | 5.4%
n | 9176 | 5.3%
o | 8568 | 5.0%
E | 6130 | 3.6%
l | 5672 | 3.3%
Other values (46) | 70810 | 41.2%

Most occurring categories, scripts, and blocks

Property | Value | Count | Frequency (%)
Category | (unknown) | 171926 | 100.0%
Script | (unknown) | 171926 | 100.0%
Block | (unknown) | 171926 | 100.0%

Every character is assigned to the single category, script, and block (unknown), so the most frequent characters per category, per script, and per block match the table of most occurring characters.

Interactions

[Figure: 81 pairwise interaction plots (9 × 9 grid over the numeric variables); images not included.]

Correlations

Variable | BOROUGH | COLLISION_ID | CONTRIBUTING FACTOR VEHICLE 4 | CONTRIBUTING FACTOR VEHICLE 5 | LATITUDE | LONGITUDE | NUMBER OF CYCLIST INJURED | NUMBER OF CYCLIST KILLED | NUMBER OF MOTORIST INJURED | NUMBER OF MOTORIST KILLED | NUMBER OF PEDESTRIANS INJURED | NUMBER OF PEDESTRIANS KILLED | NUMBER OF PERSONS INJURED | NUMBER OF PERSONS KILLED
BOROUGH | 1.000 | 0.054 | 0.050 | 0.045 | 0.006 | 0.006 | 0.028 | 0.001 | 0.008 | 0.004 | 0.002 | 0.000 | 0.008 | 0.002
COLLISION_ID | 0.054 | 1.000 | 0.067 | 0.078 | -0.016 | 0.068 | 0.039 | 0.004 | 0.117 | 0.008 | 0.039 | 0.005 | 0.153 | 0.011
CONTRIBUTING FACTOR VEHICLE 4 | 0.050 | 0.067 | 1.000 | 0.690 | 0.000 | 0.000 | 0.000 | 0.000 | 0.022 | 0.000 | 0.143 | 0.000 | 0.025 | 0.000
CONTRIBUTING FACTOR VEHICLE 5 | 0.045 | 0.078 | 0.690 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.040 | 0.000 | 0.000 | 0.000 | 0.038 | 0.000
LATITUDE | 0.006 | -0.016 | 0.000 | 0.000 | 1.000 | 0.285 | 0.003 | 0.000 | -0.032 | -0.001 | 0.003 | -0.001 | -0.026 | -0.001
LONGITUDE | 0.006 | 0.068 | 0.000 | 0.000 | 0.285 | 1.000 | 0.002 | 0.000 | 0.075 | 0.006 | -0.014 | 0.001 | 0.039 | 0.003
NUMBER OF CYCLIST INJURED | 0.028 | 0.039 | 0.000 | 0.000 | 0.003 | 0.002 | 1.000 | 0.018 | 0.004 | 0.001 | 0.000 | 0.002 | 0.004 | 0.005
NUMBER OF CYCLIST KILLED | 0.001 | 0.004 | 0.000 | 0.000 | 0.000 | 0.000 | 0.018 | 1.000 | 0.000 | 0.000 | 0.162 | 0.707 | 0.040 | 0.736
NUMBER OF MOTORIST INJURED | 0.008 | 0.117 | 0.022 | 0.040 | -0.032 | 0.075 | 0.004 | 0.000 | 1.000 | 0.018 | -0.092 | -0.003 | 0.781 | 0.008
NUMBER OF MOTORIST KILLED | 0.004 | 0.008 | 0.000 | 0.000 | -0.001 | 0.006 | 0.001 | 0.000 | 0.018 | 1.000 | -0.004 | 0.003 | 0.012 | 0.623
NUMBER OF PEDESTRIANS INJURED | 0.002 | 0.039 | 0.143 | 0.000 | 0.003 | -0.014 | 0.000 | 0.162 | -0.092 | -0.004 | 1.000 | 0.002 | 0.412 | -0.002
NUMBER OF PEDESTRIANS KILLED | 0.000 | 0.005 | 0.000 | 0.000 | -0.001 | 0.001 | 0.002 | 0.707 | -0.003 | 0.003 | 0.002 | 1.000 | -0.005 | 0.714
NUMBER OF PERSONS INJURED | 0.008 | 0.153 | 0.025 | 0.038 | -0.026 | 0.039 | 0.004 | 0.040 | 0.781 | 0.012 | 0.412 | -0.005 | 1.000 | 0.003
NUMBER OF PERSONS KILLED | 0.002 | 0.011 | 0.000 | 0.000 | -0.001 | 0.003 | 0.005 | 0.736 | 0.008 | 0.623 | -0.002 | 0.714 | 0.003 | 1.000
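
To read the matrix quickly, one can list the variable pairs whose coefficient exceeds a cutoff. The sketch below does this for a hand-copied subset of the off-diagonal entries above. The 0.5 threshold is an assumption made for illustration (it happens to separate the pairs involving variables flagged "High correlation", such as CONTRIBUTING FACTOR VEHICLE 4 and 5, from the rest); it is not taken from the report, and `strong_pairs` is a hypothetical helper.

```python
THRESHOLD = 0.5  # assumed cutoff, not stated in the report

# Selected off-diagonal coefficients copied from the correlation matrix.
pairs = {
    ("CONTRIBUTING FACTOR VEHICLE 4", "CONTRIBUTING FACTOR VEHICLE 5"): 0.690,
    ("NUMBER OF CYCLIST KILLED", "NUMBER OF PEDESTRIANS KILLED"): 0.707,
    ("NUMBER OF CYCLIST KILLED", "NUMBER OF PERSONS KILLED"): 0.736,
    ("NUMBER OF MOTORIST INJURED", "NUMBER OF PERSONS INJURED"): 0.781,
    ("NUMBER OF MOTORIST KILLED", "NUMBER OF PERSONS KILLED"): 0.623,
    ("NUMBER OF PEDESTRIANS KILLED", "NUMBER OF PERSONS KILLED"): 0.714,
    ("NUMBER OF PEDESTRIANS INJURED", "NUMBER OF PERSONS INJURED"): 0.412,
    ("LATITUDE", "LONGITUDE"): 0.285,
    ("NUMBER OF CYCLIST KILLED", "NUMBER OF PEDESTRIANS INJURED"): 0.162,
}


def strong_pairs(coefficients, threshold):
    """Return (a, b, value) for pairs with |value| >= threshold, strongest first."""
    selected = [(a, b, v) for (a, b), v in coefficients.items() if abs(v) >= threshold]
    return sorted(selected, key=lambda item: -abs(item[2]))


strong = strong_pairs(pairs, THRESHOLD)
for a, b, value in strong:
    print(f"{value:.3f}  {a} <-> {b}")
```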

Missing values

A simple visualization of nullity by column.
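
The per-column nullity shown in that chart corresponds to the "Missing (%)" figures listed for each variable. A minimal sketch of the arithmetic, using missing counts copied from this report and the dataset's 2169687 observations; the helper `missing_percent` is illustrative.

```python
N_ROWS = 2169687  # number of observations in the dataset

# Missing counts copied from the per-variable statistics in this report.
missing_counts = {
    "CONTRIBUTING FACTOR VEHICLE 2": 344595,
    "CONTRIBUTING FACTOR VEHICLE 3": 2013111,
    "CONTRIBUTING FACTOR VEHICLE 4": 2133997,
    "CONTRIBUTING FACTOR VEHICLE 5": 2159928,
    "VEHICLE TYPE CODE 2": 428782,
    "VEHICLE TYPE CODE 3": 2019068,
}


def missing_percent(count, n_rows=N_ROWS):
    """Share of rows with a missing value, as a percentage rounded to one decimal."""
    return round(100 * count / n_rows, 1)


percentages = {name: missing_percent(count) for name, count in missing_counts.items()}
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```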
2025-04-20T15:31:20.307627image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
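
The quantities behind these three views can be approximated with a boolean missingness mask. The following is a minimal sketch with pandas, not the report's own plotting code. It uses a four-row toy frame whose values are copied from the sample rows of this report, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Toy frame: a few columns and rows copied from the report's sample rows.
df = pd.DataFrame(
    {
        "BOROUGH": [np.nan, "BROOKLYN", np.nan, "BROOKLYN"],
        "ZIP CODE": [np.nan, 11230.0, np.nan, 11208.0],
        "LATITUDE": [np.nan, 40.621790, np.nan, 40.667202],
        "ON STREET NAME": [
            "WHITESTONE EXPRESSWAY",
            "OCEAN PARKWAY",
            "THROGS NECK BRIDGE",
            np.nan,
        ],
    }
)

is_missing = df.isna()             # nullity matrix: True where a cell is missing
missing_count = is_missing.sum()   # nullity by column (counts)
missing_share = is_missing.mean()  # fraction of missing cells per column

# Nullity correlation: Pearson correlation between the 0/1 missingness indicators.
nullity_corr = is_missing.astype(int).corr()

print(missing_count)
print(nullity_corr.round(2))
```

In this toy frame BOROUGH and ZIP CODE are always missing together, so their nullity correlation is 1, while ON STREET NAME is missing only where BOROUGH is present, which gives a negative value.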

Sample

First rows

|  | CRASH DATE | CRASH TIME | BOROUGH | ZIP CODE | LATITUDE | LONGITUDE | LOCATION | ON STREET NAME | CROSS STREET NAME | OFF STREET NAME | NUMBER OF PERSONS INJURED | NUMBER OF PERSONS KILLED | NUMBER OF PEDESTRIANS INJURED | NUMBER OF PEDESTRIANS KILLED | NUMBER OF CYCLIST INJURED | NUMBER OF CYCLIST KILLED | NUMBER OF MOTORIST INJURED | NUMBER OF MOTORIST KILLED | CONTRIBUTING FACTOR VEHICLE 1 | CONTRIBUTING FACTOR VEHICLE 2 | CONTRIBUTING FACTOR VEHICLE 3 | CONTRIBUTING FACTOR VEHICLE 4 | CONTRIBUTING FACTOR VEHICLE 5 | COLLISION_ID | VEHICLE TYPE CODE 1 | VEHICLE TYPE CODE 2 | VEHICLE TYPE CODE 3 | VEHICLE TYPE CODE 4 | VEHICLE TYPE CODE 5 |
| 0 | 09/11/2021 | 2:39 | NaN | NaN | NaN | NaN | NaN | WHITESTONE EXPRESSWAY | 20 AVENUE | NaN | 2.0 | 0.0 | 0 | 0 | 0 | 0 | 2 | 0 | Aggressive Driving/Road Rage | Unspecified | NaN | NaN | NaN | 4455765 | Sedan | Sedan | NaN | NaN | NaN |
| 1 | 03/26/2022 | 11:45 | NaN | NaN | NaN | NaN | NaN | QUEENSBORO BRIDGE UPPER | NaN | NaN | 1.0 | 0.0 | 0 | 0 | 0 | 0 | 1 | 0 | Pavement Slippery | NaN | NaN | NaN | NaN | 4513547 | Sedan | NaN | NaN | NaN | NaN |
| 2 | 11/01/2023 | 1:29 | BROOKLYN | 11230.0 | 40.621790 | -73.970024 | (40.62179, -73.970024) | OCEAN PARKWAY | AVENUE K | NaN | 1.0 | 0.0 | 0 | 0 | 0 | 0 | 1 | 0 | Unspecified | Unspecified | Unspecified | NaN | NaN | 4675373 | Moped | Sedan | Sedan | NaN | NaN |
| 3 | 06/29/2022 | 6:55 | NaN | NaN | NaN | NaN | NaN | THROGS NECK BRIDGE | NaN | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Following Too Closely | Unspecified | NaN | NaN | NaN | 4541903 | Sedan | Pick-up Truck | NaN | NaN | NaN |
| 4 | 09/21/2022 | 13:21 | NaN | NaN | NaN | NaN | NaN | BROOKLYN BRIDGE | NaN | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Passing Too Closely | Unspecified | NaN | NaN | NaN | 4566131 | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | NaN |
| 5 | 04/26/2023 | 13:30 | NaN | NaN | NaN | NaN | NaN | WEST 54 STREET | NaN | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | Unspecified | NaN | NaN | NaN | 4623759 | Sedan | Box Truck | NaN | NaN | NaN |
| 6 | 11/01/2023 | 7:12 | NaN | NaN | NaN | NaN | NaN | HUTCHINSON RIVER PARKWAY | NaN | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Following Too Closely | Driver Inattention/Distraction | NaN | NaN | NaN | 4675709 | Sedan | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN |
| 7 | 11/01/2023 | 8:01 | NaN | NaN | NaN | NaN | NaN | WEST 35 STREET | HENRY HUDSON RIVER | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Failure to Yield Right-of-Way | NaN | NaN | NaN | NaN | 4675769 | Sedan | NaN | NaN | NaN | NaN |
| 8 | 04/26/2023 | 22:20 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 61 Ed Koch queensborough bridge | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | NaN | NaN | NaN | NaN | 4623865 | Sedan | Pick-up Truck | NaN | NaN | NaN |
| 9 | 09/11/2021 | 9:35 | BROOKLYN | 11208.0 | 40.667202 | -73.866500 | (40.667202, -73.8665) | NaN | NaN | 1211 LORING AVENUE | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | NaN | NaN | NaN | NaN | 4456314 | Sedan | NaN | NaN | NaN | NaN |

Last rows

|  | CRASH DATE | CRASH TIME | BOROUGH | ZIP CODE | LATITUDE | LONGITUDE | LOCATION | ON STREET NAME | CROSS STREET NAME | OFF STREET NAME | NUMBER OF PERSONS INJURED | NUMBER OF PERSONS KILLED | NUMBER OF PEDESTRIANS INJURED | NUMBER OF PEDESTRIANS KILLED | NUMBER OF CYCLIST INJURED | NUMBER OF CYCLIST KILLED | NUMBER OF MOTORIST INJURED | NUMBER OF MOTORIST KILLED | CONTRIBUTING FACTOR VEHICLE 1 | CONTRIBUTING FACTOR VEHICLE 2 | CONTRIBUTING FACTOR VEHICLE 3 | CONTRIBUTING FACTOR VEHICLE 4 | CONTRIBUTING FACTOR VEHICLE 5 | COLLISION_ID | VEHICLE TYPE CODE 1 | VEHICLE TYPE CODE 2 | VEHICLE TYPE CODE 3 | VEHICLE TYPE CODE 4 | VEHICLE TYPE CODE 5 |
| 2169677 | 04/15/2025 | 15:52 | BRONX | 10461.0 | 40.854298 | -73.85492 | (40.854298, -73.85492) | NaN | NaN | 2007 WILLIAMSBRIDGE RD | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Driver Inattention/Distraction | Unspecified | NaN | NaN | NaN | 4805948 | Sedan | Sedan | NaN | NaN | NaN |
| 2169678 | 04/15/2025 | 20:00 | QUEENS | 11366.0 | 40.728012 | -73.78483 | (40.728012, -73.78483) | UNION TPKE | 184 ST | NaN | 2.0 | 0.0 | 0 | 0 | 0 | 0 | 2 | 0 | Traffic Control Disregarded | Unspecified | NaN | NaN | NaN | 4806383 | Station Wagon/Sport Utility Vehicle | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN |
| 2169679 | 04/15/2025 | 14:30 | MANHATTAN | 10036.0 | 40.757553 | -73.98551 | (40.757553, -73.98551) | NaN | NaN | 1516 BROADWAY | 1.0 | 0.0 | 1 | 0 | 0 | 0 | 0 | 0 | Traffic Control Disregarded | NaN | NaN | NaN | NaN | 4806096 | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | NaN |
| 2169680 | 04/15/2025 | 23:20 | QUEENS | 11691.0 | 40.610480 | -73.75028 | (40.61048, -73.75028) | NaN | NaN | 12-50 REDFERN AVE | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | View Obstructed/Limited | NaN | NaN | NaN | NaN | 4806081 | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | NaN |
| 2169681 | 04/07/2025 | 8:50 | BROOKLYN | 11221.0 | 40.695114 | -73.91186 | (40.695114, -73.91186) | PUTNAM AVE | KNICKERBOCKER AVE | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Backing Unsafely | Unspecified | NaN | NaN | NaN | 4806432 | Sedan | NaN | NaN | NaN | NaN |
| 2169682 | 04/15/2025 | 5:58 | NaN | NaN | 40.761272 | -73.95571 | (40.761272, -73.95571) | FDR DRIVE | NaN | NaN | 2.0 | 0.0 | 1 | 0 | 0 | 0 | 1 | 0 | Driver Inattention/Distraction | Unspecified | NaN | NaN | NaN | 4806221 | Station Wagon/Sport Utility Vehicle | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN |
| 2169683 | 04/14/2025 | 19:22 | STATEN ISLAND | 10304.0 | 40.601810 | -74.09283 | (40.60181, -74.09283) | RICHMOND RD | ROME AVE | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Driver Inattention/Distraction | NaN | NaN | NaN | NaN | 4806275 | NaN | NaN | NaN | NaN | NaN |
| 2169684 | 04/14/2025 | 21:25 | QUEENS | 11436.0 | 40.675716 | -73.79124 | (40.675716, -73.79124) | NaN | NaN | 147-06 123 AVE | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Turning Improperly | Unspecified | NaN | NaN | NaN | 4806294 | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | NaN |
| 2169685 | 04/15/2025 | 13:56 | MANHATTAN | 10000.0 | 0.000000 | 0.00000 | (0.0, 0.0) | NaN | NaN | 90-02 EAST DR | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Following Too Closely | Unspecified | NaN | NaN | NaN | 4806171 | Bike | Bike | NaN | NaN | NaN |
| 2169686 | 03/23/2025 | 13:00 | BRONX | 10462.0 | 40.836330 | -73.85505 | (40.83633, -73.85505) | NaN | NaN | 1502 OLMSTEAD AVE | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | Unspecified | NaN | NaN | NaN | 4806253 | Sedan | NaN | NaN | NaN | NaN |
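
One of the sample rows (COLLISION_ID 4806171) carries LATITUDE 0.0, LONGITUDE 0.0 and LOCATION (0.0, 0.0). These cells are not NaN, so they do not count toward the missing-value figures. The following is a minimal sketch of how such rows might be flagged with pandas. Treating (0.0, 0.0) as a placeholder is an assumption made here for illustration, not something the report states, and the two-row frame only copies values from the sample.

```python
import numpy as np
import pandas as pd

# Two rows copied from the sample above (only the columns needed here).
df = pd.DataFrame(
    {
        "COLLISION_ID": [4806096, 4806171],
        "LATITUDE": [40.757553, 0.0],
        "LONGITUDE": [-73.98551, 0.0],
        "LOCATION": ["(40.757553, -73.98551)", "(0.0, 0.0)"],
    }
)

# Assumption: a coordinate pair of exactly (0.0, 0.0) is a placeholder value.
placeholder = (df["LATITUDE"] == 0.0) & (df["LONGITUDE"] == 0.0)

# Work on a copy so the original frame stays untouched.
cleaned = df.copy()
cleaned.loc[placeholder, ["LATITUDE", "LONGITUDE", "LOCATION"]] = np.nan

print(cleaned)
```

Whether to convert such rows to NaN or to keep them depends on the analysis. The sketch only shows that they can be separated from rows with usable coordinates.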